Efficient Mining Maximal Subspace Differential Co-expression Patterns in Matrix Datasets: a General Earthquake Analysis Approach
Authors
Abstract
Electromagnetic anomalies observed before earthquakes have been confirmed in many cases of strong earthquakes. Analyzing earthquake magnetic anomalies is an effective approach to seismic precursor detection. Traditional frequent pattern mining methods for electromagnetic matrix datasets typically find co-related items. However, these methods may miss items that form differential co-related patterns across different datasets. Mining these differential co-related patterns is more valuable for inferring potential knowledge. In this paper, we develop an algorithm, MSPattern, to mine maximal subspace differential co-expression patterns. MSPattern first constructs a weighted undirected item-item relational graph. It then mines all maximal co-related patterns from this graph using an item-growth method. MSPattern also uses several techniques to produce maximal patterns without maintaining candidate patterns. Evaluated on real electromagnetic matrix datasets and gene expression datasets, the experimental results show that our algorithm can find potential knowledge for earthquake analysis and that MSPattern is more efficient than traditional algorithms. The performance of MSPattern is also evaluated using empirical p-values and gene ontology; the results show that our algorithm can find statistically significant and biologically relevant differential co-expression patterns.
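The two steps named in the abstract (build a weighted undirected item-item graph, then grow item sets into maximal patterns) can be illustrated with a minimal sketch. This is not the authors' MSPattern implementation, and everything specific in it is an assumption made for illustration: the edge weight is taken to be the difference of absolute Pearson correlations between two datasets, `min_diff` is a hypothetical threshold, and the growth step is a plain Bron-Kerbosch enumeration of maximal fully connected item sets, swapped in for the paper's item-growth method, whose details the abstract does not give.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def build_graph(data_a, data_b, min_diff):
    """Build a weighted undirected item-item graph as a dict of dicts.

    data_a, data_b: dict mapping item -> list of values (one row per item).
    Assumed edge weight: |corr in A| - |corr in B|; an edge is kept
    only if the weight is at least min_diff (hypothetical threshold).
    """
    graph = {item: {} for item in data_a}
    for u, v in combinations(sorted(data_a), 2):
        w = abs(pearson(data_a[u], data_a[v])) - abs(pearson(data_b[u], data_b[v]))
        if w >= min_diff:
            graph[u][v] = w
            graph[v][u] = w
    return graph


def maximal_patterns(graph, min_size=2):
    """Enumerate maximal fully connected item sets (Bron-Kerbosch, no pivoting)."""
    results = []

    def grow(pattern, candidates, excluded):
        # A pattern is maximal when no item can extend it, neither an
        # unexplored candidate nor an item already handled elsewhere.
        if not candidates and not excluded:
            if len(pattern) >= min_size:
                results.append(sorted(pattern))
            return
        for item in sorted(candidates):
            neighbours = set(graph[item])
            grow(pattern | {item}, candidates & neighbours, excluded & neighbours)
            candidates = candidates - {item}
            excluded = excluded | {item}

    grow(set(), set(graph), set())
    return sorted(results)


# Toy data: g1, g2, g3 are strongly correlated in dataset A but not in B.
data_a = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [2, 4, 6, 8, 10],
    "g3": [5, 4, 3, 2, 1],
    "g4": [1, 0, -2, 0, 1],
}
data_b = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [1, 0, -2, 0, 1],
    "g3": [1, -2, 0, 2, -1],
    "g4": [3, 1, 4, 1, 5],
}

graph = build_graph(data_a, data_b, min_diff=0.5)
patterns = maximal_patterns(graph)
print(patterns)
```

Because each maximal set is reported only when no remaining or previously explored item can extend it, the sketch never stores a pool of candidate patterns to filter afterwards, which is the property the abstract highlights. Whether MSPattern achieves this the same way is not stated in the source.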
Similar resources
Efficient Mining Differential Co-Expression Constant Row Bicluster in Real-Valued Gene Expression Datasets
Biclustering aims to mine a number of co-expressed genes under a set of experimental conditions in gene expression dataset. Recently, differential co-expression biclustering approach has been used to identify class-specific biclusters between two gene expression datasets. However, it cannot handle differential co-expression constant row biclusters efficiently in real-valued datasets. In this pa...
Subspace Differential Coexpression Analysis: Problem Definition and a General Approach
In this paper, we study methods to identify differential coexpression patterns in case-control gene expression data. A differential coexpression pattern consists of a set of genes that have substantially different levels of coherence of their expression profiles across the two sample-classes, i.e., highly coherent in one class, but not in the other. Biologically, a differential coexpression pat...
Subspace Clustering of Microarray Data based on Domain Transformation
We propose a mining framework that supports the identification of useful knowledge based on data clustering. With the recent advancement of microarray technologies, we focus our attention on gene expression datasets mining. In particular, given that genes are often coexpressed under subsets of experimental conditions, we present a novel algorithm on subspace clustering. In contrast to previous ...
Subspace Clustering of Microarray Data Based on Domain Transformation
We propose a mining framework that supports the identification of useful knowledge based on data clustering. With the recent advancement of microarray technologies, we focus our attention on gene expression datasets mining. In particular, given that genes are often coexpressed under subsets of experimental conditions, we present a novel subspace clustering algorithm. In contrast to previous app...
Efficient mining of distance-based subspace clusters
Traditional similarity measurements often become meaningless when dimensions of datasets increase. Subspace clustering has been proposed to find clusters embedded in subspaces of high dimensional datasets. Many existing algorithms use a grid based approach to partition the data space into nonoverlapping rectangle cells, and then identify connected dense cells as clusters. The rigid boundaries o...